Main types of neural networks
- Artificial neural network (ANN)
- Convolutional neural network (CNN)
- Recurrent neural network (RNN)
“A neural network is the second best way to solve any problem. The best way is to actually understand the problem.” -Unknown
The above example shows a very simple convolutional neural network used to recognize digits in an image.
Side note: you can create CNNs in R, but they are not fun to build there.
A neural network is a parallel, distributed information processing structure consisting of processing elements (nodes) interconnected by unidirectional signal channels. Each node has a single output connection, which branches into as many collateral connections as desired. The node’s output signal can be of any mathematical type (and can later be converted to non-numerical values). All of the processing that goes on within a node must depend only on the values of the input signals arriving at the node via its incoming connections and on values stored in the node’s local memory (Hecht-Nielsen 1992).
Neural networks are a subset of machine learning that combines modern computer science with cognitive psychology. Their data processing is loosely modeled on the way your brain’s neurons process information. A neuron fires only when its input reaches a certain threshold; when it does, it sends a signal, known as an action potential, along the axon and across synapses to other neurons, and so on.
Basic principle of Perceptron model.
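The firing behaviour described above can be sketched as a single perceptron: multiply each input by a weight, add a bias, and output 1 only when the total crosses a threshold. The function name `perceptron`, the weights, and the bias below are illustrative assumptions, not values taken from this chapter.

```r
# Minimal perceptron sketch (hypothetical names and values).
perceptron <- function(inputs, weights, bias) {
  # Weighted sum of the inputs plus a bias term
  total <- sum(inputs * weights) + bias
  # Step activation: the node "fires" (1) only when the total is positive
  if (total > 0) 1 else 0
}

# Hand-picked weights that make the perceptron behave like a logical AND
and_weights <- c(1, 1)
and_bias <- -1.5

perceptron(c(1, 1), and_weights, and_bias)  # both inputs on: fires
perceptron(c(1, 0), and_weights, and_bias)  # only one input on: does not fire
```

In a trained network the weights and bias are learned from data rather than chosen by hand; they are fixed here only to keep the example small.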
Originally, scientists were simply interested in whether they could recreate how the human brain works; they had little desire to make this concept much more than a concept. After WWII and Alan Turing’s work on what would become the modern-day computer, scientists who were able to get their hands on this technology could turn it loose on whatever they could imagine, including early neural networks. For the next few decades, the theory of artificial intelligence and neural networks developed faster than the technology needed to support it, until modern computer processors and parallel computing caught up with the theory. Now the opposite seems to be true: computing power is enabling many more applications of neural networks, and the sky is the limit for real-world use.
Neural networks are currently the fastest-growing research topic (per Google Scholar) as well as one of the most well-funded research areas. Many of Nature’s most-cited papers are based on neural network research. They are capable of filling in analytical gaps where other statistical methods fall short. Although they are modeled on the way our brains process information, they can find patterns that we cannot.
Neural networks vary A LOT in structure, but most of them have relatively similar components and building blocks.
Structure of a simple deep learning model
A node is simply a container for a value with one or more weighted input connections. The exception is the input layer, whose nodes hold raw input values that have not yet been passed through any weighted connections.
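To make the idea of weighted input connections concrete, here is a rough sketch of how values might flow from an input layer through one hidden layer to a single output node. The weights are arbitrary numbers, and the sigmoid activation is an assumption made for illustration; the helper names `sigmoid` and `layer_forward` are hypothetical.

```r
# Sigmoid squashes any real number into the range (0, 1)
sigmoid <- function(x) 1 / (1 + exp(-x))

# Each node's value: weighted sum of the previous layer's values, plus a bias,
# passed through the activation function.
layer_forward <- function(values, weights, bias) {
  sigmoid(as.vector(weights %*% values) + bias)
}

# Two input nodes: raw data, no incoming weights
input_values <- c(0.2, 0.9)

# Hidden layer: 3 nodes, each with 2 weighted input connections (one row per node)
hidden_weights <- matrix(c( 0.5, -0.3,
                            0.8,  0.1,
                           -0.6,  0.4), nrow = 3, byrow = TRUE)
hidden_bias <- c(0, 0.1, -0.1)

# Output layer: 1 node with 3 weighted input connections
output_weights <- matrix(c(1.0, -1.0, 0.5), nrow = 1)
output_bias <- 0

hidden_values <- layer_forward(input_values, hidden_weights, hidden_bias)
output_value  <- layer_forward(hidden_values, output_weights, output_bias)
output_value
```

Storing one row of weights per node keeps each layer a single matrix multiplication, which is why adding nodes or layers does not change the shape of the code.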
The input layer consists of whatever data you choose to help predict your desired result. Inputs come in many varieties, and the number of input nodes varies, but each input node always represents a single floating-point value. The network we will be generating in R requires all of our values to be between 0 and 1, so the data must be scaled first.
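One common way to meet the 0-to-1 requirement is min-max scaling. The sketch below uses base R only; the helper name `min_max_scale` and the small data frame are hypothetical and are not the data used later in this chapter.

```r
# Min-max scaling: maps the smallest value to 0 and the largest to 1.
# Note: a column where every value is identical would divide by zero.
min_max_scale <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Hypothetical example data
raw <- data.frame(height_cm = c(150, 165, 180),
                  weight_kg = c(50, 70, 90))

# Apply the scaling to every column
scaled <- as.data.frame(lapply(raw, min_max_scale))
scaled
```

Keep the original minimum and maximum of each column around, since they are needed later to convert predictions back to the original units.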
The output layer is our result. For classification-type models (e.g. what object does the given input describe?), it will generate a value ranging from 0 to 1. In an ideal world, it would be exactly 0 or 1 each time, but it usually falls somewhere in between, and whichever of the two it is closest to is the generated prediction. For numerical outputs, we have to reverse the scaling to get values in the original units.
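Both ways of reading the output can be sketched in a few lines of base R. The output values, the helper name `unscale`, and the original minimum and maximum below are made up for illustration, and the second half assumes the target was min-max scaled before training.

```r
# Classification: pick whichever of 0 or 1 the output is closest to
raw_outputs <- c(0.08, 0.93, 0.61)   # hypothetical network outputs
predicted_class <- ifelse(raw_outputs >= 0.5, 1, 0)
predicted_class

# Numerical output: reverse a min-max scaling to recover the original units
unscale <- function(scaled, original_min, original_max) {
  scaled * (original_max - original_min) + original_min
}
unscale(c(0, 0.5, 1), original_min = 50, original_max = 90)
```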
library(dplyr)      # data manipulation helpers
library(neuralnet)  # training feed-forward neural networks